OUTLINE
I. Data Sources
II. Representing Surfaces
III. Methods of Interpolation (point)
IV. Methods of Interpolation (areal)
Definition
· Digital Elevation Model: A model of the continuous variation of relief over a geographic area.
While these surfaces show variation along three data axes (x, y, z), they are not considered true 3-D representations, a term reserved for data that vary continuously throughout a 3-D framework (Burrough and McDonnell, 1998).
I) Data Sources
· Stereo aerial photos or satellite images
· Point samples measured directly with GPS under a specific sampling design.
· Digitized topographic maps
II) Representing Surfaces
A) Isolines (contours):
· elevation can be represented as contours connecting points of equal value. This representation is well suited for display purposes but ill suited for numerical analysis or modeling.
B) Mathematical models:
· high degree of complexity; not well suited for general application (Fourier analysis, high-order polynomials).
C) Point models
· most widely used models, suitable for both display and numerical analysis/modeling. Uniform sampling, or regular lattice sampling, is based on a regular grid. Adaptive sampling, or irregular lattice sampling, results when points are collected based on the variability of the surface: the more variability, the more points collected.
D) Altitude matrix
· surface represented by elevation values at uniform intervals.
· most common representation for a DEM.
· well suited for contour, slope, aspect, shading, basin delineation.
· problems with data redundancy, fixed cell size, cross axis calculations.
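As a quick illustration (not part of the original notes), here is a minimal sketch of deriving slope and aspect from an altitude matrix with NumPy; the elevation values, the 30 m cell size, and the simple finite-difference approach are all assumptions made for the example.

```python
import numpy as np

# Hypothetical 5 x 5 altitude matrix (elevations in meters) on a 30 m grid.
dem = np.array([
    [100, 101, 103, 106, 110],
    [ 99, 100, 102, 105, 109],
    [ 97,  98, 100, 103, 107],
    [ 94,  95,  97, 100, 104],
    [ 90,  91,  93,  96, 100],
], dtype=float)
cell_size = 30.0  # meters

# Finite-difference gradients along the rows (y) and columns (x) of the matrix.
dz_dy, dz_dx = np.gradient(dem, cell_size)

# Slope: angle of steepest ascent/descent at each cell.
slope_deg = np.degrees(np.arctan(np.hypot(dz_dx, dz_dy)))
# Aspect here is the direction of steepest ascent, measured counterclockwise
# from the +x (east) axis; GIS packages use differing aspect conventions.
aspect_deg = np.degrees(np.arctan2(dz_dy, dz_dx)) % 360.0

print(slope_deg.round(1))
print(aspect_deg.round(1))
```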
E) Triangulated Irregular Network (TIN)
· uses sheets of continuous, connected triangular facets based on a Delaunay triangulation of irregularly spaced nodes or observations (often generated by progressive sampling).
· vector topological structure, similar to that for defining polygons.
· altitude and X,Y coordinates are stored at the nodes.
· each triangle serves as a reference facet for other calculations such as slope and aspect.
· avoids redundancy of altitude matrix.
· more efficient calculation of slope.
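A minimal sketch of building a TIN from irregularly spaced points, assuming SciPy's Delaunay triangulation and made-up survey coordinates; each facet's slope is taken from the plane through its three nodes.

```python
import numpy as np
from scipy.spatial import Delaunay

# Hypothetical irregularly spaced survey points: x, y, z (elevation).
pts = np.array([
    [ 0.0,  0.0, 10.0],
    [50.0,  5.0, 12.0],
    [20.0, 40.0, 15.0],
    [60.0, 45.0, 11.0],
    [35.0, 80.0, 18.0],
])

# Delaunay triangulation of the XY locations defines the TIN facets.
tin = Delaunay(pts[:, :2])

for tri in tin.simplices:
    p1, p2, p3 = pts[tri]
    # Normal of the plane through the three nodes gives the facet slope.
    nx, ny, nz = np.cross(p2 - p1, p3 - p1)
    slope = np.degrees(np.arctan(np.hypot(nx, ny) / abs(nz)))
    print(f"facet {tri}: slope = {slope:.1f} degrees")
```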
III) Methods of Interpolation (point)
Interpolation:
· the procedure of estimating the value of properties at unsampled sites within an area covered by existing point observations.
Extrapolation:
· estimating the value of a property at sites outside the area covered by existing observations.
Principle
· points close together in space are more likely to have similar values of a property of interest than points further apart.
Assumption
· The assumptions made in using one interpolation procedure versus another, as they relate to the data being interpolated, must be clearly stated. Does the interpolation procedure make sense? What factors must be considered when performing interpolations?
The Challenge of Interpolation
Finding a plausible model to suit the phenomena being modeled.
A) Discrete Techniques
1) Thiessen Polygons - use the closest sample to determine the value at a given point, with polygon boundaries formed by the perpendicular bisectors between sample points. Problems: the size and shape of the polygons depend on the sampling design, the value of interest is estimated from a sample of one, and distance from the sample point is not factored into the interpolation.
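A minimal sketch of Thiessen-polygon estimation, assuming hypothetical sample points and values; assigning the value of the nearest sample is equivalent to reading off the Thiessen (Voronoi) polygon that the location falls in.

```python
import numpy as np

# Hypothetical sample points (x, y) and their measured values.
samples = np.array([[2.0, 3.0], [8.0, 1.0], [5.0, 7.0], [9.0, 8.0]])
values = np.array([10.0, 14.0, 9.0, 12.0])

def thiessen_estimate(x, y):
    """Assign the value of the nearest sample, i.e. the Thiessen polygon
    containing the query location."""
    d = np.hypot(samples[:, 0] - x, samples[:, 1] - y)
    return values[np.argmin(d)]

print(thiessen_estimate(6.0, 6.0))  # takes the value of the closest sample
```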
B) Continuous Methods
1) Global Methods
Trend surface analysis:
· A polynomial fit of the points to approximate the surface. The method is often used to remove broad features prior to applying some other local interpolator.
· Example (trend surface polynomials by number of independent variables and degree):

1 independent variable
  1st degree: y = b0 + b1X1
  2nd degree: y = b0 + b1X1 + b2X1^2
  3rd degree: y = b0 + b1X1 + b2X1^2 + b3X1^3

2 independent variables
  1st degree: y = b0 + b1X1 + b2X2
  2nd degree: y = b0 + b1X1 + b2X2 + b3X1^2 + b4X2^2 + b5X1X2
  3rd degree: y = b0 + b1X1 + b2X2 + b3X1^2 + b4X2^2 + b5X1X2 + b6X1^3 + b7X2^3 + b8X1X2^2 + b9X2X1^2

3 independent variables
  1st degree: y = b0 + b1X1 + b2X2 + b3X3
  2nd degree: y = b0 + b1X1 + b2X2 + b3X3 + b4X1^2 + b5X2^2 + b6X3^2 + b7X1X2 + b8X1X3 + b9X2X3
  3rd degree: y = b0 + b1X1 + b2X2 + b3X3 + b4X1^2 + b5X2^2 + b6X3^2 + b7X1X2 + b8X1X3 + b9X2X3 + b10X1^3 + b11X2^3 + b12X3^3 + b13X1^2X2 + b14X1X2^2 + ... (remaining cubic cross terms)
· Problems: very susceptible to outliers; hard to ascribe physical meaning to higher-order polynomials (a least-squares fitting sketch is given below).
· Fourier series: linear combination of sine and cosine waves. Better for analysis of periodic functions.
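For the trend-surface idea above, a minimal sketch of fitting a 2nd-degree surface in two independent variables by ordinary least squares; the synthetic sample data and the NumPy-based fit are assumptions for illustration, not a prescribed procedure from the notes.

```python
import numpy as np

# Hypothetical sample locations (x, y) and observed values z.
rng = np.random.default_rng(0)
x, y = rng.uniform(0, 100, 30), rng.uniform(0, 100, 30)
z = 50 + 0.3 * x - 0.2 * y + 0.002 * x * y + rng.normal(0, 1, 30)

# Design matrix for a 2nd-degree trend surface in two independent variables:
# z = b0 + b1*x + b2*y + b3*x^2 + b4*y^2 + b5*x*y
A = np.column_stack([np.ones_like(x), x, y, x**2, y**2, x * y])
coeffs, *_ = np.linalg.lstsq(A, z, rcond=None)

def trend(xq, yq):
    """Evaluate the fitted trend surface at a query location."""
    return coeffs @ np.array([1.0, xq, yq, xq**2, yq**2, xq * yq])

print(coeffs.round(4))
print(round(trend(50.0, 50.0), 2))
```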
2) Local Interpolators
· Splines: piece-wise polynomial fit which ensure that joins between pieces are smooth. May vary as a function of the location of the break point
· Moving averages: a smoothing technique that computes the average value from a local neighborhood. Can be weighted as a function of distance to give a weighted moving average. Factors to consider:
· size of neighborhood.
· shape of neighborhood.
· minimum number of points.
· location of points.
· shape of weight function.
· z(xj) = Sum(i = 1..n) [ z(xi) * wtij ] / Sum(i = 1..n) wtij   (equation 6.1)
where
z(xj) = interpolated value at the unknown location xj
z(xi) = value at data point xi
wtij = weight given to data point xi, a function of the distance between the unknown point xj and data point xi
n = the number of data points in the neighborhood
wtij can equal 1 (a simple average) or 1/d^p, where d is the distance between xi and xj and p is a chosen power (giving a distance-weighted moving average)
· Problems: maxima and minima can occur only at data points; there is no built-in method for assessing the quality of the interpolation; and the "duck-egg" problem, a halo pattern that forms around solitary points whose values differ greatly from their surroundings.
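A minimal sketch of the weighted moving average of equation 6.1 with wtij = 1/d^p (inverse-distance weighting); the sample points, power, search radius, and minimum point count are assumed for the example.

```python
import numpy as np

# Hypothetical sample points (x, y) with measured values z.
xy = np.array([[1.0, 1.0], [4.0, 2.0], [3.0, 5.0], [6.0, 6.0]])
z = np.array([12.0, 15.0, 10.0, 14.0])

def idw(x0, y0, power=2.0, radius=5.0, min_points=2):
    """Weighted moving average of equation 6.1 with weights wtij = 1/d^power,
    restricted to a circular neighborhood of the given radius."""
    d = np.hypot(xy[:, 0] - x0, xy[:, 1] - y0)
    if np.any(d == 0):                      # query coincides with a sample point
        return z[np.argmin(d)]
    mask = d <= radius
    if mask.sum() < min_points:             # not enough neighbors to interpolate
        return np.nan
    w = 1.0 / d[mask] ** power
    return np.sum(w * z[mask]) / np.sum(w)

print(round(idw(3.0, 3.0), 2))
```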
3) Kriging (Global & Local):
An optimal interpolation method that uses spatial autocovariance, based on the theory of regionalized variables. Spatial variation is expressed as the sum of a structural component associated with a constant mean value or trend, a random but spatially correlated component, and spatially independent random noise (residual error). The semivariogram is used to determine the distance over which values are autocorrelated, the direction of that autocorrelation, and the appropriate weighting scheme; it plots semi-variance against lag (distance).
Z(x) = m(x) + e'(x) + e''   (equation 6.2)
where
m(x) is a deterministic function describing the structural component (the trend)
e'(x) is the stochastic, locally varying but spatially dependent component, called the regionalized variable
e'' is a residual, spatially independent noise term
Characteristics
· exact or perfect interpolator: surface passes through all points whose values are known.
· provides a measure of statistical error.
· minimizes the variance.
· unbiased estimator.
The first step is to decide on an appropriate function for m(x). If no trend is assumed, the mean value of the sampling area is used.
Calculate e'(x) using the semi-variance g as an estimate:
g(h) = 1/(2n) * Sum(i = 1..n) {z(xi) - z(xi - h)}^2
where
n = the number of pairs of sample points separated by the distance (lag) h
{z(xi) - z(xi - h)}^2 = the squared difference in value between sample points a distance h apart
A variogram, which plots g(h) as a function of the lag h, can then be created.
The variogram provides information about the size of the search window, the shape of the spatial correlation function, and its relationship to the overall variance.
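A minimal sketch of computing an experimental semivariogram g(h) by binning point pairs into lag classes; the synthetic data, lag width, and number of lags are assumptions for illustration.

```python
import numpy as np

# Hypothetical sample locations and values.
rng = np.random.default_rng(1)
xy = rng.uniform(0, 100, size=(80, 2))
z = np.sin(xy[:, 0] / 20.0) + 0.1 * rng.normal(size=80)

def semivariogram(xy, z, lag_width=10.0, n_lags=8):
    """Experimental semivariance g(h) = sum of squared differences over
    pairs in each lag bin, divided by twice the number of pairs."""
    # All pairwise separation distances and squared value differences.
    dists = np.hypot(xy[:, None, 0] - xy[None, :, 0],
                     xy[:, None, 1] - xy[None, :, 1])
    sqdiff = (z[:, None] - z[None, :]) ** 2
    iu = np.triu_indices(len(z), k=1)          # each pair counted once
    d, s = dists[iu], sqdiff[iu]
    lags, gammas = [], []
    for k in range(n_lags):
        in_bin = (d >= k * lag_width) & (d < (k + 1) * lag_width)
        if in_bin.sum() > 0:
            lags.append((k + 0.5) * lag_width)
            gammas.append(s[in_bin].sum() / (2.0 * in_bin.sum()))
    return np.array(lags), np.array(gammas)

lags, gammas = semivariogram(xy, z)
for h, g in zip(lags, gammas):
    print(f"lag {h:5.1f}: gamma = {g:.3f}")
```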
4) Knox Methodology (space-time interaction model)
The Knox analysis consists of pairing the data points and then evaluating whether pairs are found close together in both space and time. A statistical test determines whether the observed number of close-space, close-time pairs deviates significantly from the number expected under a random process. Knox (1964) suggests the construction of a 2 x 2 contingency table as follows:
Each pair of data points is classified as follows:

                            Space
                     Close                    Not close
  Time    Close      Close in both (o11 = T)  Time only (o12)
          Not close  Space only (o21)         Not close (o22)
The test statistic T counts the pairs close in both space and time:
T = Sum over all distinct pairs (i, j) of sij * tij
where,
n is the number of data points;
sij is 1 if the (i, j)-th pair is close in space and zero otherwise;
tij is 1 if the (i, j)-th pair is close in time and zero otherwise.
There are N = n(n-1)/2 distinct pairs that can be formed from n data points.
The Knox statistic (T) is tested against the expected number of pairs that would be found close in both space and time, given that s pairs were found close in space and t pairs were found close in time. The expected number of pairs is calculated under the assumption that space is independent of time. This equation reads:
E(T) = (s * t) / N
The approximate variance of the Knox statistic, developed by Barton and David (1966), adjusts for pairs sharing data points and is expressed in terms of the following quantities (a computational sketch of the test follows this list):
s: the number of pairs found close in space;
t: the number of pairs found close in time;
s1: the number of pairs close in space sharing one data point;
t1: the number of pairs close in time sharing one data point;
N: the total number of pairs;
n: the number of data points (birds, in the West Nile virus example below).
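A minimal computational sketch of the Knox statistic. Instead of the Barton and David variance approximation above, it assesses significance with a Monte Carlo permutation of the event times; that choice, along with the synthetic locations, times, and critical parameters, is an assumption of this example rather than the method described in the notes.

```python
import numpy as np

def knox_test(xy, times, space_crit, time_crit, n_perm=999, seed=0):
    """Knox statistic T = number of pairs close in both space and time,
    with a Monte Carlo permutation p-value (times shuffled over locations)."""
    n = len(times)
    iu = np.triu_indices(n, k=1)                       # the N = n(n-1)/2 pairs
    d_space = np.hypot(xy[:, None, 0] - xy[None, :, 0],
                       xy[:, None, 1] - xy[None, :, 1])[iu]
    s = d_space <= space_crit                          # sij: close in space

    def t_close(t):
        """tij: close in time for a given assignment of times to locations."""
        return np.abs(t[:, None] - t[None, :])[iu] <= time_crit

    T_obs = int(np.sum(s & t_close(times)))
    rng = np.random.default_rng(seed)
    T_perm = np.array([np.sum(s & t_close(rng.permutation(times)))
                       for _ in range(n_perm)])
    p_value = (1 + np.sum(T_perm >= T_obs)) / (n_perm + 1)
    return T_obs, p_value

# Hypothetical event locations (e.g., dead-bird sightings) and report days.
rng = np.random.default_rng(2)
xy = rng.uniform(0.0, 10.0, size=(60, 2))
times = rng.uniform(0.0, 90.0, size=60)
print(knox_test(xy, times, space_crit=1.0, time_crit=7.0))
```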
Critical Parameters Selection
The selection of critical parameters, that is, the threshold values for what is considered 'close' in space and in time, is a challenging task in the Knox methodology.
Running the Knox Test over a Continuous Surface
A one-half mile grid is overlaid across NYC (the global area) and the Knox test is run on the centroid of each grid cell. The use of a buffer around each centroid ensures an overlapping coverage of NYC. The critical parameters are set and the local significance for each cell centroid is assessed.
West Nile Virus Example
IV) Methods of Interpolation (areal)
1) Overlay
· overlay of target and source zones
· determine the proportion of each source zone assigned to each target zone
· apportion the total value of the attribute for each source zone to target zones according to areal proportions.
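A minimal sketch of areal-weighting interpolation following the three steps above, assuming Shapely for the polygon overlay and hypothetical source and target zones.

```python
from shapely.geometry import Polygon

# Hypothetical source zones, each with a total attribute value (e.g., population),
# and one target zone that overlaps both.
source_zones = [
    (Polygon([(0, 0), (4, 0), (4, 4), (0, 4)]), 1000.0),
    (Polygon([(4, 0), (8, 0), (8, 4), (4, 4)]), 600.0),
]
target_zone = Polygon([(2, 1), (6, 1), (6, 3), (2, 3)])

# Apportion each source zone's total to the target in proportion to the
# share of the source zone's area that falls inside the target zone.
estimate = 0.0
for zone, total in source_zones:
    overlap = zone.intersection(target_zone).area
    estimate += total * (overlap / zone.area)

print(round(estimate, 1))  # areal-weighted estimate for the target zone
```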